s3: use .temporary lifetime for downloaded body so it gets freed#29923
s3: use .temporary lifetime for downloaded body so it gets freed#29923robobun wants to merge 8 commits into
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (1)
WalkthroughAdjusted Blob→FormData lifetime semantics to align buffer ownership for S3-downloaded vs in-memory Blobs. Added a concurrent test that repeatedly exercises S3 body accessors and measures RSS to detect retained buffers or memory growth. Changes
🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Review rate limit: 1/5 review remaining, refill in 44 minutes and 42 seconds. Comment |
|
Updated 2:48 PM PT - May 4th, 2026
❌ @robobun, your commit 7463a13 has 3 failures in
🧪 To try this PR locally: bunx bun-pr 29923That installs a local version of the PR into your bun-29923 --bun |
|
Found 3 issues this PR may fix:
🤖 Generated with Claude Code |
|
This PR may be a duplicate of:
🤖 Generated with Claude Code |
There was a problem hiding this comment.
1f4f4fb addresses my last concern (the text-leak fixture is now sequential with a symmetric 30/30 warmup), and all earlier nits are resolved — deferring to a human only because this changes buffer-ownership semantics in Blob.zig and overlaps with #29776, which a maintainer should reconcile.
Extended reasoning...
Overview
The PR fixes a memory leak where every s3file.text()/.json()/.arrayBuffer()/.bytes()/.formData() call orphaned the entire downloaded body. Three small hunks in src/bun.js/webcore/Blob.zig: doReadFromS3's wrapper now passes .temporary instead of .clone; toFormDataWithBytes now defer frees its input on .temporary; and getFormData flips from .temporary to .clone so in-memory Blobs (which pass Store-owned sharedView() bytes through line 3906) aren't freed by the new defer. The rest is tests: a new credential-free s3-body-leak.test.ts covering all five methods via a local mock server, a rewrite of s3-text-leak-fixture.js (sequential, symmetric warmup, settle loop), and a 30→60 s timeout bump in s3.leak.test.ts.
Security risks
None. No auth, crypto, parsing, or trust-boundary changes — purely allocation-lifetime bookkeeping and test scaffolding. The only risk class is memory-safety (use-after-free / double-free), not security.
Level of scrutiny
Moderate-to-high. The diff is small and well-reasoned, but it changes ownership semantics of a heap buffer in core Blob code that every S3 download and Bun.file().formData() flows through. The getFormData .temporary→.clone flip is a compensating change that must stay paired with the new defer free in toFormDataWithBytes; I traced the in-memory path (toFormData → sharedView() → toFormDataWithBytes at Blob.zig:3906) and it's correct, but this is exactly the kind of invariant a maintainer should sign off on.
Other factors
- All three of my earlier inline comments were addressed (formData coverage,
describe.concurrent, and the text-leak fixture flakiness — the latter twice, with 1f4f4fb finally making warmup and measured phases symmetric and sequential as suggested). - CI build #48969 for 1f4f4fb is still in flight; the previous build's
s3.leak.test.tsfailures should be resolved by the fixture rewrite but that's unconfirmed. - Overlaps with community PR #29776 which makes the same core
.clone→.temporarychange — a maintainer should decide which to land / how to credit. - ASAN-clean per the PR description, and
FormData.test.tspasses.
a7dec86 to
0900b96
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/js/bun/s3/s3-body-leak.test.ts`:
- Line 135: Replace the strict expect(stderr).toBe("") assertion with a check
that first strips the known ASAN startup warning from stderr (e.g. filter out
lines matching /^WARNING: ASAN interferes.*$/) and then assert the remaining
string is empty; update the test around the expect(stderr).toBe("") usage
(referencing the stderr variable in s3-body-leak.test.ts) so the test filters
that single known ASAN line before asserting emptiness.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 49b28e85-b850-4233-9406-3b4f9be64d16
📒 Files selected for processing (1)
test/js/bun/s3/s3-body-leak.test.ts
There was a problem hiding this comment.
No issues found on the latest revision — the .temporary semantics check out against all four to*WithBytes consumers and the compensating getFormData flip is correct. Deferring only because this is manual memory-ownership in the core Blob/S3 path (and overlaps with #29776), so a maintainer sign-off seems appropriate.
Extended reasoning...
Overview
The net diff touches two files: three small logic changes in src/bun.js/webcore/Blob.zig (the .clone → .temporary flip in doReadFromS3's wrapper, a new defer free in toFormDataWithBytes for the .temporary case, and a compensating .temporary → .clone flip in getFormData), plus a new credential-free RSS-growth regression test at test/js/bun/s3/s3-body-leak.test.ts. All earlier changes to s3-text-leak-fixture.js were reverted, so the diff is now minimal and focused.
I traced each to*WithBytes consumer under .temporary: toStringWithBytes adopts via toExternalValue or frees explicitly (Blob.zig:3688-3701), toArrayBufferViewWithBytes wraps the buffer in an owning ArrayBuffer (Blob.zig:3865-3871), toJSONWithBytes already freed on .temporary, and toFormDataWithBytes now does too. The getFormData flip is necessary and correct: in-memory Blobs reach toFormDataWithBytes with sharedView() (Store-owned bytes, Blob.zig:3914-3919), which must not be freed; file/S3 paths supply their own .temporary and ignore this value. The change mirrors the existing local-file path (doReadFile → NewReadFileHandler), so the pattern is established.
Security risks
None identified. This is a memory-leak fix; the only failure mode of getting lifetimes wrong here would be UAF/double-free, not a security boundary bypass. I checked specifically for the UAF risk introduced by the new defer free and confirmed the getFormData compensating change covers the in-memory-Blob case, and that FormData.toJS copies slices out before the defer fires.
Level of scrutiny
This is production-critical runtime code: manual memory ownership in the Blob/S3 download path, where a mistake means use-after-free or double-free for every s3file.text()/json()/arrayBuffer()/bytes()/formData() call. The change is small and I believe correct, but per the approval guidelines this is exactly the kind of "touches critical code paths" change where a human maintainer who owns this code should sign off rather than relying solely on bot review.
Other factors
All five of my prior inline comments across earlier revisions were addressed and resolved by the author; the bug-hunting system found nothing on the latest revision; the new test is credential-free, runs concurrently, covers all five body methods, and the author reports 25/25 ASAN passes. There is also an overlapping community PR (#29776) making the same core change — a maintainer may want to coordinate which lands and how the other is closed/credited.
On successful S3 download, S3HttpSimpleTask moves ownership of its heap response_buffer into the S3DownloadResult and zeroes its own copy, so the callback owns the allocation. S3BlobDownloadTask.onS3DownloadResolved forwards those bytes to the to*WithBytes handlers via doReadFromS3's wrapper, which was passing Lifetime.clone. .clone assumes the bytes are backed by the Blob's Store, but an S3 Store only holds path/credentials. Every handler either copied the buffer and dropped the original (arrayBuffer/bytes), created an external string whose finalizer only deref'd the Store (text), or never freed at all (json, formData). Net effect: every s3file.text()/.json()/.arrayBuffer()/.bytes() leaked the entire downloaded object. Switch doReadFromS3 to .temporary (matching the local-file path in read_file.zig) so handlers take ownership and free the buffer. Also make toFormDataWithBytes respect .temporary (it ignored lifetime entirely, leaking for both S3 and local-file reads). Because getFormData was passing .temporary even for in-memory Blobs whose bytes are Store-owned via sharedView(), change it to .clone like getText / getArrayBuffer / getBytes; the file/S3 paths supply their own .temporary internally. The existing s3-text-leak-fixture.js never caught this because Array(n).fill(readLargeFile()) calls the function once and fills n slots with the same promise; fix that too. Add a new credential-free leak test that points S3Client at a local Bun.serve mock and measures RSS growth across 50 1 MiB downloads for each body method.
- s3-body-leak.test.ts: use describe.concurrent (each test spawns an independent subprocess with its own port-0 server), add a formData case so the toFormDataWithBytes .temporary path has the same coverage as the other four body methods, and give each case a 60s timeout for slow debug/ASAN builds. - s3-text-leak-fixture.js: now that it actually issues 100 concurrent reads (vs effectively 1 before), add a gc+sleep settle loop to both phases so the RSS samples are taken under the same conditions, and bump MAX_ALLOWED_MEMORY_USAGE_INCREMENT 15 -> 30 to allow for socket pool / allocator segment growth at the higher concurrency. The per-call leak this guards against would show as ~100 MiB here.
…owth noise The previous 10-warmup / 50-measured split meant allocator segments grew between the two RSS samples even without a leak, occasionally pushing growth past the threshold (observed 27 MiB on text under ASAN). Running the same iteration count for both phases puts the allocator at steady state before the baseline sample, so growth with the fix applied is near zero while the leak still shows as ~iterations MiB in the measured phase. 25/25 pass under ASAN.
s3.leak.test.ts was failing on 5 CI platforms after the previous commit fixed the Array.fill(promise) bug so the fixture actually issued 100 concurrent reads. With 10-concurrent warmup vs 100-concurrent measured, allocator segments / socket pool / TLS buffers for the higher concurrency pushed RSS past the threshold even with the leak fixed. Apply the same approach proven stable in s3-body-leak.test.ts: sequential reads with an equal-sized warmup phase so the allocator is at steady state before the baseline sample. 30 iterations each; the leak shows as ~30 MiB growth, threshold is 15 MiB. Bump the test timeout to 60s for the 60 sequential R2 round-trips under debug builds. Validated against a local mock endpoint: passes with the fix, fails with 'Memory usage is too high' on 1.3.13.
The fixture's .fill(promise) bug fix turned out to need R2 credentials to validate properly; the sequential rewrite passed on 4/5 platforms but ASAN's memory overhead still pushed past the threshold. Since the new s3-body-leak.test.ts covers the same .text() path (plus the other four body methods) without credentials and runs on every platform, the R2-gated fixture is redundant for this leak. Leave it as-is and let the credential-free test carry the regression coverage.
Match the pattern used elsewhere in test/ (memfd-disabled.test.ts, udp_socket.test.ts, etc.) so the test doesn't flake on ASAN lanes that emit 'WARNING: ASAN interferes with JSC signal handlers'.
07636bf to
802b91c
Compare
There was a problem hiding this comment.
The three-way lifetime change (doReadFromS3 → .temporary, toFormDataWithBytes → defer free, getFormData → .clone) checks out against toFormData's in-memory sharedView() path, but since this is interlocked memory-ownership semantics in core Blob code where a slip means UAF/double-free, I'd like a human to give it a final look.
Extended reasoning...
Overview
Two files: src/runtime/webcore/Blob.zig (3 small hunks) and a new credential-free leak test test/js/bun/s3/s3-body-leak.test.ts. The Zig changes are tightly coupled: (1) doReadFromS3's wrapper switches the downloaded body's lifetime from .clone to .temporary so consumers free/adopt it; (2) toFormDataWithBytes now honors .temporary with a defer free(buf); (3) getFormData flips from .temporary to .clone because the in-memory path hands sharedView() (Store-owned bytes) to toFormDataWithBytes, and freeing those would be a UAF now that (2) actually frees. I cross-checked toFormData at Blob.zig:4062–4075 and the routing matches the PR description exactly — file/S3 paths supply their own lifetime via doReadFile/doReadFromS3 and ignore the getFormData value, while the in-memory branch passes it through.
Security risks
None identified. This is purely buffer-lifetime management for downloaded S3 bodies; no auth, crypto, parsing, or untrusted-input surface changes. The risk class here is memory safety (UAF/double-free), not security boundary.
Level of scrutiny
Higher than the diff size suggests. The fix is three lines of substance plus comments, but it changes ownership semantics in a hot path (Blob body readers) and the three edits must remain consistent — reverting any one without the others reintroduces either the leak or a UAF on in-memory blob.formData(). That coupling is well-documented in the new comments and the PR description, and the new test covers all five body methods including formData, but I'd still want a maintainer who owns Blob.zig to confirm the .temporary contract for toStringWithBytes/toArrayBufferViewWithBytes matches what read_file.zig already relies on.
Other factors
All seven of my earlier inline comments across previous revisions have been addressed and resolved (concurrent tests, formData coverage, fixture revert, comment wording, ASAN stderr filter, per-test timeout justified as load-bearing under ASAN). The bug-hunting pass found nothing this round. CI build #49152 shows 4 failures (s3-storage-class, fs.watch, worker_threads segfault, elysia stream) that are unrelated to the touched code paths. A community PR (#29776) independently arrived at the same core .clone → .temporary change, which is a good corroborating signal.
Fixes #29083
What
await s3file.text()/.json()/.arrayBuffer()/.bytes()/.formData()leaked the entire downloaded S3 object body on every call.Reproduce
Cause
On a successful 200/204/206 download,
S3HttpSimpleTaskmoves ownership of its heapresponse_buffer(aMutableStringonbun.default_allocator) into theS3DownloadResultand zeroes its own copy, so the task'sdeinit()frees nothing — the callback owns the allocation.S3BlobDownloadTask.onS3DownloadResolvedaliasesresponse.body.list.itemsand forwards it to theto*WithByteshandlers viadoReadFromS3's wrapper, which was passingLifetime.clone..clonesemantics assume the bytes are backed by the Blob'sStore. An S3Storeonly holds path + credentials, not the bytes. So under.clone:toArrayBufferViewWithBytescopied into a newArrayBufferand orphaned the originaltoStringWithBytescreated an external string whose finalizer (Store.external) only derefs the S3 StoretoJSONWithBytesfreed only on.temporarytoFormDataWithBytesignored the lifetime entirelyNothing freed the download buffer.
Fix
doReadFromS3: pass.temporaryinstead of.clone, matching the local-file path inread_file.zig. Consumers then adopt or free the allocation.toFormDataWithBytes: respect.temporary(FormData.toJScopies all slices out, so the input can be freed afterwards). This also fixes the same leak forBun.file(path).formData().getFormData: pass.cloneinstead of.temporary. For in-memory Blobs,toFormDatahandssharedView()(Store-owned bytes) totoFormDataWithBytes;.temporarythere was always wrong but was masked bytoFormDataWithBytesignoring lifetime. File/S3 paths supply their own.temporaryinternally and ignore this value. This brings it in line withgetText/getArrayBuffer/getBytes.Tests
New
test/js/bun/s3/s3-body-leak.test.ts: pointsS3Clientat a localBun.servemock (no credentials needed), downloads a 1 MiB object 50× per method, and asserts RSS growth stays well under one-buffer-per-iteration. CoversarrayBuffer/bytes/text/json/formData.test/js/web/html/FormData.test.tspasses (110/111; the one remaining timeout is pre-existing on main).The existing credential-gated
s3-text-leak-fixture.jswas never actually exercising this (it usednew Array(N).fill(readLargeFile()), which calls the function once and fills N slots with the same promise). Left as-is since the new credential-free test provides strictly better coverage across all platforms.Related
#20487 — this PR fixes the
Bun.S3Clientportion of that report; the@google-cloud/storageSDK path is unrelated code and not touched here.Overlaps with #29776 (community PR from a fork, CI blocked) which makes the same core
.clone→.temporarychange. This PR differs in that its test uses a localBun.servemock S3 endpoint so it runs without real S3 credentials or Docker.